Optimizing Performance and Sustainability for AI Inference